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1. Introduction 



Abstract. The aim of this work is twofold. First, we survey the techniques developed in [T] [2] to reconstruct the 
division (birth) rate from the cell volume distribution data in certain structured population models. Secondly, we 
implement such techniques on experimental cell volume distributions available in the literature so as to validate 
the theoretical and numerical results. As a proof of concept, we use the data reported in the classical work of 
Kubitschek concerning Escherichia coli in vitro experiments measured by means of a Coulter transducer- 
' multichannel analyzer system (Coulter Electronics, Inc., Hialeah, Fla, USA.) Despite the rather old measurement 

, technology, the reconstructed division rates still display potentially useful biological features. 
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' Structured growth models have been the subject of much attention for the last decades [HIS]- Amongst such models, 
' equal division ones play a crucial role for their relative simplicity. In equal division models, each cell of volume 2x 
subdivides into two cells of size x according to a certain rate B{2x). In order to understand better such models and 
, related microbiological mechanisms, it would be important to calibrate the birth rate B[-) by means of observed 
cell densities n{t,x), where t is the time parameter. Thus, in this article, we consider the problem of calibrating 
from measured data the following size-structured model for cell division: 

■§^n{t,x) + ■^[g{x)n{t,x)] + B{x)n{t,x) = AB{2x)n{t,2x), x,t > 0, 

g{x^O)n{t,x = 0) = 0,t>0, (1.1) 
n{0,x) = n"{x) > 0. 

The function g describes the microscopic growth behaviour of individual cells. 

Usually two different models have been considered. The first one consists of constant g, which without loss of 
generality we take to be one. This is usually called linear growth model and leads to characteristic lines of the form 
x{t) = XQ + t. The second one is g{x) — nx, and leads to characteristic lines of the form x{t) — xq exp{Kt) where n is 
a constant. It is called exponential growth model. Obviously both of them subsume some kind of unlimited growth 
of individual organisms. There has been substantial amount of debate in the literature concerning the correct way 
of modehng the function g. See for example [6l[7l[8] and references therein. 

Two macroscopic quantities of biological interest are naturally computed from the model (jl.ip . The total 
cell quantity J^{t) — n{t, x)dx and the total biomass M{t) = xn{t, x)dx . Integrating equation (jl.ip yields 
J\f{t) = /J^ B{x)n{t, x)dx . This means that number of cells increase only by division. Integrating the identity 
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function times (jl.ip yields M.{t) — g{x)n{t,x)dx . This means that the biomass increases only by nutrient 
uptake. 

Under reasonable assumptions on B, the long time behavior of n(t,x) is of the form n{t,x) w moN{x)e^°'^ . 
More precisely, it is proved in [9j (see also jTUJ [Tl] for the case g{x) = 1) that, under fairly general conditions on 
the coefficients, there is a unique solution {N, Xq) to the following eigenvalue problem 

^{gN) + {Xo + B{x))N = AB{2x)N{2x), x > 0, 
< gix = 0)N{x = 0) = 0, (1.2) 
N{x) > for a; > 0, N{x)dx = 1, 

where Aq > and x'^gN e L°° n for all a > 0. Furthermore, it was also shown in [9l[T0l[T2] that 

n{t,x)e-^"* r^z^ m,oN{x), \n L^{M.+ ,(j){x)dx), (1.3) 

where the weight 4> is the unique solution to the adjoint problem to Equation (jl.2[) . In other words, Aq is the growth 
rate of such a system and is usually called "Malthus parameter" in population biology. By integrating the equation 
and integrating it against the weight x, we also know that Aq is related to N by the relation 

Ao = y g{x)N{x)dx / y xNdx. (1.4) 

The plan for the article is the following: In Section [2l we discuss some biological preliminaries of the model 
under consideration. In Section [3] we describe the theoretical aspects of the inverse problem under consideration. 
This will give us the underlying theoretical basis for the calibration techniques. In Section [H we present the 
calibration results. In Section [5] we conclude with a discussion of the results and suggestions for further research. 



2. Biological Preliminaries 

In many situations it is important to understand when an organism is mature and ready to reproduce. The concept of 
size at maturation was developed as a theoretical tool to express how individuals deal with environmental variability 
within growth rates. In addition, individuals may have unequal fitness and might respond differently to the same 
environmental conditions. In this context, we can interpret the bacterial-growth experiments of Kubitschek [3] as 
a very special situation: (1) the cells grow in a chcmostat and all individuals have the same amount of nutrients, 
in all generations of the population. (2) the cells are clones from each other and have the same fitness, responding 
equally to the same environmental effects. (3) Kubitschek [3] also did a synchronic experiment, using temperature 
control techniques, so that all the cells would divide at the same time. 

One might argue that in this situation, with no diiference in the fitness of the individuals and in the environ- 
mental conditions, there should be almost no variability in the preferred size of division. However, the reconstruc- 
tions of B(x) detect an intrinsic variability that is inherent from the division process. This justifies the importance 
of the reconstructed division rate. Moreover, E. coli is commonly used as a model for more complex organisms and 
the present work can be considered a first step toward the modelling of more complex cell division phenomena. 

We now address the biological suitability of the mathematical model. The specific size-structured model under 
consideration in this work is suitable to microorganisms with the following characteristics: 

(a) All the cells of a given class of microorganisms grow according to some deterministic rule. 

(b) All cells, after attaining a sufficient size, divide into two, and only two, identical daughter cells. 

(c) The cell's age (generation) does not influence the growth law and the preferable size of division. 

(d) The total number of individuals of the population increases exponentially. 

(e) There is a negligible number of deaths during normal growth. 
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We will now argue that the statements above are reasonable for the bacteria E. Coli growing under certain 
conditions: 

Assumption (a) is in the spirit of ^ , where the author argues for the existence of a growth law of cells that 
can be found and understood. 

Perhaps the most delicate assumption is (b). The cell division process may result in daughters with unequal 
sizes. Equal division does not occur surely for all individuals of a microbiological population, but we may quantify 
how much they deviate from a perfect division by a statistical description, as explained in the work of Koch [7] . 

The bacteria E. Coli is known for its quite small variation in its subdivisions [71 [131 [H] • It is usually assumed 
(see [7]) that the fluctuations in the critical size-at-division of individual cells is random and not correlated with 
other cell cycle events in the current cell generation or in earlier generations. This justifies (c) for the bacteria E. 
Coli. 

It is well known that E. coli cells in normal growth conditions achieves an exponential phase [15] thus con- 
firming claim (d) in agreement with Eauation ll.31 

Finally, for statement (e), cell death during normal growth is considered by Koch [7J as a minor source 
of variability. It is true that some cells may be dead or moribund and may distort the distribution if they are 
clustered at some cell size range. But these effects can be neglected if there is plenty of nutrient, which occurs in 
the exponential phase of the bacterial growth. This hypothesis must be carefully revised when the population is 
submitted to starvation and other stressful types of experiments. The present model would not apply for instance, 
to the budding yeast S. cerevisae [16] . However, it could easily be adapted to take into account a known death rate. 



3. Methodology: Inverse problems and theoretical results 

Based on the limiting behavior of the distribution n{t, x) as t grows, the problem of calibration of B assuming 5=1 
reduces to the following: How can we estimate the division rate B from the knowledge of the steady dynamics N 
and Aq ? This corresponds to finding B a solution to 

d 

AB{2x)N{2x) ~ B{x)N{x) ^ L{x) -.^ —N{x) + XoN{x), x > 0, (3.1) 

assuming that {N,\o) is known, or, thanks to (|1.4p . that N is known. However, in practical applications we have 
only an approximate knowledge of (A^, Ao), given by noisy data {Ne,Xe), with Ne G _L^(IR+) for instance. This 
means that we have no way of controlling -^N^, so we cannot control the precision of a solution i?^ to problem 
(j3.ip when a perturbed replaces N. Thus, in the context of noisy data, the inverse problem under consideration 
is ill-posed and regularization is needed. 

As alluded above, denotes the measured stable distribution. The subindex refers to the difference in the 
appropriate norm to the value A^ associated to the true B of the model. The precise value of e is obviously unknown, 
but it depends on a number of factors such as accuracy of the measurements, quality of the model and proximity 
of the interpolation leading from a finite amount of data to the function with domain M_|_. 

Following [2\ three methods will be used, namely: quasi-reversibility, filtering, and a hybrid of these. 
Regularization by Quasi- Reversibility. To regularize the problem by the quasi-reversibility proposed in [I] we work 
with 

a:§^{B,M + 4B,,4y)Ar,(y) = i3,,„(f)Af,(f)-t-A.,„Ar,(f)+2^(^Af,(f)), y > 0, 
iB,,aN,){0) = 0. 
According to the eigenvalue theory [2], we have to choose 

A,,„ = ( / N,dx] ( I xN.dx + T / N<^dx] . (3.3) 



(3.2) 
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The method thus consists in solving numericaly the Equation (j3.2p . This was analyzed in detail in Section 3 of [2]. 
Regularization by Filtering. In this method we basicaly filter the data so as to smooth out the noisy component 
of the measurements. For a > 0, we introduce 

p^{x)^-p{-), peC^m, / p{x)dx^l, p>0, Supp(p) C [0,1], 

and we replace in (|3.1[) the term -^N^ + X^N^ by 

d d f°° d 

{jj-Ne + \e,aNt) * Paix) = iVe * (tTP" + ^<i,aPa)ix) = / Ne{x){—pa + \e,aPa){x - x)dx . 



We set TVj Q, := * p^- In this way, we obtain a smooth term in L^(R+). Furthermore, N^^^ converges to iV^ in 
when a tends to zero. We now have to find Be^a solution of 

d 

4B,,„(2a;)iVe,a(2a;) - B,^a{x)N,^aix) = ^^e,a + Ae,aiVe,a(a;), x>0 . (3.4) 

Once again, the numerical aspects of the solution of such problem are addressed in Section 3 of [5] . 
Convergence and the Hybrid Method. In [U [2] estimates are obtained that guarantee that for the choice of a = 
©(■v/e) the computed value of B^ a converges to B in the appropriate norms, with a speed of convergence in the 
order of y/e. 

In a number of numerical experiments with simulated data we found it convenient to use a hybrid method 
where we perform filtering and apply the quasi-reversibility technique. In the case of g(x) = kx, at present we 
do not have the theoretical and numerical results of [TJ [5]. However, a simple adaptation of our codes led to 
implementations that yield comparable results. We shall report on both cases. 



4. Reconstruction Results 

In this section we report the results of the reconstructions in two sets of data from [3]. We actually performed 
the numerical inversion in four sets, but wc chose two of them where the experimental set up seemed to fit better 
our model. As expected with real data, a number of difficulties have to be overcome. To cite a few: a) The data 
is obtained on a rather scarce set of discrete points, b) It clearly displays a substantial amount of noise. Yet, 
such noise is very hard to quantify, c) We are not sure whether the conditions for the validity of the model were 
indeed satisfied, and whether enough time ellapsed so as to guarantee a good approximation to stationary limiting 
distribution. 

We performed the reconstructions both for the linear and the exponential models. For each of the data sets, 
we followed the following steps (both for the linear model as well as for the exponential model): 

1. Transcribed the reported measurements from [2j to an array. 

2. Completed the boundary data for the volume x close to zero and large enough by setting the boundary 
conditions to zero (as required by the theory). 

3. Interpolated to a uniform grid on the region under consideration using splines. 

4. Deduced from (|1.4p and from the knowledge of both TV^ and Aq (given in the article [5] by the doubling times 
Tq — ) the constant value go — Ao ^j^^dx definition of g{x) — go (linear growth) or g{x) — X^x 
(exponential growth). 

5. Performed a search for a good regularization parameter a that would give a sufficiently small residual in the 
quasi-reversibility method of [T] and in the hybrid methods. The result was also validated by other heuristical 
criteria such as the use of the minimum of the ratio '^^"^"■^ [17] , see Figure [H 

6. Studied the behavior of the solutions to the inverse problem by varying the regularization parameter. 

7. Studied the consistency of the computed limiting distribution for the reconstructed B and the input data by 
computing the direct problem for some variants of the reconstructed B. 
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We performed a number of extensive tests and examples, some of which that were not reported here could be 
found in 18J. In all that follows we take the filter parameter to be a = 0.0001. We remark that this choice is very 
arbitrary, but it is based on the standard deviation of the volumes once we assume the instrumental resolution of 
the Coulton counter has a standard deviation of the order a = 0.03/im in the diameter [2. Thus, the standard 
deviation of the volume, assuming a normal distribution, is of the order of ay ~ (7r/6)vT5(T"^ w 10^*(/im)'^. 




Vol. in units of Vj^ Vol, in units of 



Figure 1. Input data from for doubling time of 20 min (left), reconstructed B for the linear 
model (center), reconstructed B for the exponential model (right). Vb = 1.36(/im)^. The value of 
a = 0.1 (larger a give lower peaks). 




Figure 2. Input data from [3] for doubling time of 54 min (left), reconstructed B for the linear 
model, reconstructed B for the exponential model (right). Vb — 0.61(//m)"^. The value of a = 0.2 
for the linear case, a = 0.1 for the exponential case. 



•o 0.4 




Figure 3. Example of criteria for the choice of a. Left: for a doubling time of 20 min, curve of 
restdMai t^]^]^ rcspcct to a, linear case. The minimum is attained for a = 0.4 but the curve is very 
flat for a > 0.2. Center: same curve for doubling time of 54 min, linear case, the minimum is still 
for a = 0.4. Right: as in Center, doubling time of 54 min and linear case, the curve of the residual 
exhibits a minimum for a = 0.2. 
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5. Discussion and Perspectives 

Most of the reconstructed birth rates we have obtained seem to display a global maximum close to twice the average 
cell volume. Furthermore, in all cases we have at least a local maximum in that neighborhood. The precise location, 
however, varies substantially with the doubling time of the E. coli experiments, which in turn is a function of 
the experimental conditions. The peak locations also vary a little with the inversion methods and regularization 
parameters. 

The presence of doublets and triplets in the experimental data may give rise to artifacts for values of x higher 
than 3. Indeed, it is not clear, due to the highly nonlinear characteristic of the reconstruction whether an incorrect 
increase in the measurements near a certain a;o would lead to an increase in the reconstructed value of B{xq). This 
is a point that might be investigated further both theoretically and numerically. 

We cannot infer any conclusion on the discussion of whether individual cells undergo exponential or linear 
volume growth. Still, a superficial observation of the results seems to indicate that exponential growth gives smaller 
residuals. This calls for a more extensive testing on better quality experimental data. 

The optimal choice of the regularization parameter a should be investigated further. In this work we took a 
rather naive choice due to lack of a good estimates on the experimental noise. Indeed, we tried to locate the optimal 
choice by looking at the curve of the residuals as a function of a. A natural extension of this work would be a more 
detailed investigation of a priori and a posteriori choices for a, such as the quasi-optimality criterion [19j or the 
L-curve method [20] . 
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